A New Dissimilarity Measure between Feature-Vectors

Author

  • Liviu Octavian Mafteiu-Scai
Abstract

Distance measures are very important in some clustering and machine learning techniques. Many such measures exist for determining the dissimilarity between feature-vectors, but the choice should depend on the problem to be solved. This paper proposes a simple but robust distance measure, called Reference Distance Weighted, for calculating the distance between feature-vectors with real values. The basic attribute that distinguishes it from other measures is that the distance is measured from one of the feature-vectors, considered as a reference system, to the other feature-vectors. In fact, this reference vector belongs to a class of a classification system. A second distinctive attribute is that its value does not depend on the orders of magnitude of the different features of the vectors. In addition, through a parameter called the factor of relevance, each feature receives a weight reflecting its influence, because different features influence the dissimilarity estimate differently depending on the final problem to be solved. An extension of the proposed distance allows working with hybrid vectors, i.e. vectors with both real and logical values. Future research directions are also provided. General Terms: Algorithms.
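The abstract does not state the formula for Reference Distance Weighted, so the following is only an illustrative sketch and not the paper's definition. It uses a hypothetical stand-in, a relevance-weighted mean of relative differences, chosen because it has the three properties the abstract names: it is measured from a reference vector, it does not change when a feature is rescaled, and each feature carries its own weight. The function name, the argument names, and the requirement that reference features be nonzero are all assumptions made for this example.

```python
from typing import Sequence


def reference_weighted_dissimilarity(
    reference: Sequence[float],
    other: Sequence[float],
    relevance: Sequence[float],
) -> float:
    """Hypothetical stand-in, NOT the formula from the paper.

    Computes a relevance-weighted mean of per-feature relative
    differences, each taken with respect to the reference vector.
    """
    if not (len(reference) == len(other) == len(relevance)):
        raise ValueError("all three sequences must have the same length")
    total_weight = sum(relevance)
    if total_weight <= 0:
        raise ValueError("relevance factors must sum to a positive value")

    acc = 0.0
    for r, x, w in zip(reference, other, relevance):
        if r == 0:
            # Assumption of this sketch: the reference defines the scale,
            # so a zero reference feature is not handled here.
            raise ValueError("reference features must be nonzero")
        # Dividing by |r| removes the feature's order of magnitude.
        acc += w * abs(x - r) / abs(r)
    return acc / total_weight


# Two features with very different orders of magnitude.
ref = [2.0, 1000.0]
vec = [3.0, 1500.0]
weights = [1.0, 1.0]
d = reference_weighted_dissimilarity(ref, vec, weights)

# Rescaling the second feature in both vectors leaves the value unchanged.
d_rescaled = reference_weighted_dissimilarity([2.0, 1.0], [3.0, 1.5], weights)

# Swapping the roles changes the value, since the reference sets the scale.
d_swapped = reference_weighted_dissimilarity(vec, ref, weights)
```

In this sketch the measure is deliberately asymmetric: the value depends on which vector plays the role of the reference, which mirrors the abstract's description of measuring from one vector to the others.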


Similar resources

Dissimilarity-Based Classification of Anatomical Tree Structures

A novel method for classification of abnormality in anatomical tree structures is presented. A tree is classified based on direct comparisons with other trees in a dissimilarity-based classification scheme. The pair-wise dissimilarity measure between two trees is based on a linear assignment between the branch feature vectors representing those trees. Hereby, localized information in the branch...


Symmetric Distortion Measure for Speaker Recognition

We consider matching functions in vector quantization (VQ) based speaker recognition systems. In VQ-based systems, a speaker model consists of a small collection of representative vectors, and matching is performed by computing a dissimilarity value between the unknown speaker’s feature vectors and the speaker models. Typically, the average/total quantization error is used as the dissimilarity ...


Convex Optimizations for Distance Metric Learning and Pattern Classification

The goal of machine learning is to build automated systems that can classify and recognize complex patterns in data. Not surprisingly, the representation of the data plays an important role in determining what types of patterns can be automatically discovered. Many algorithms for machine learning assume that the data are represented as elements in a metric space. For example, in popular algorit...


Prototype Selection for Classification in Standard and Generalized Dissimilarity Spaces

A common way to represent patterns for recognition systems is by feature vectors lying in some space. If this representation is based only on the predefined object features, it is independent of the other objects. In contrast, a dissimilarity representation of objects takes into account the relations between them by some measure of resemblance (e.g. dissimilarity). The nearest neighbour (1-NN) ...


Clustering Gene Expression Data Using Random Forest Dissimilarity

Background: The clustering of gene expression data plays an important role in the diagnosis and treatment of cancer. These kinds of data typically involve a large number of variables (genes) compared with the number of samples (patients). Many clustering methods have been built on the dissimilarity among observations, calculated by a distance function. As increa...




Publication date: 2013